
38 research outputs found

Compulsory Flow Q-Learning: an RL algorithm for robot navigation based on partial-policy and macro-states

Author: COSTA Anna Helena Reali
SILVA Valdinei Freire da
Publication venue: Sociedade Brasileira de Computação

Reinforcement Learning is carried out on-line, through trial-and-error interactions of the agent with the environment, which can be very time-consuming for robots. In this paper we contribute a new learning algorithm, CFQ-Learning, which uses macro-states, a low-resolution discretisation of the state space, and a partial-policy to get around obstacles, both of them based on the complexity of the environment structure. The use of macro-states can prevent learning algorithms from converging, but it can accelerate the learning process. On the other hand, partial-policies can guarantee that an agent fulfils its task, even when acting through macro-states. Experiments show that CFQ-Learning achieves a good balance between policy quality and learning rate. Sponsors: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), GRICES, FAPESP, CNP
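The macro-state idea in the abstract above can be illustrated with a minimal sketch. This is plain tabular Q-learning indexed by a coarse grid cell, not the authors' CFQ-Learning: the partial-policy is not modelled, and the names `macro_state` and `q_update`, the 5x5 cell size, and the values of `alpha` and `gamma` are all assumptions chosen for illustration.

```python
def macro_state(x, y, cell=5):
    """Map a fine grid position to a coarse macro-state (assumed 5x5 blocks)."""
    return (x // cell, y // cell)


def q_update(q, s, a, reward, s_next, actions, alpha=0.1, gamma=0.9):
    """One standard Q-learning update, keyed by macro-state instead of raw state."""
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (reward + gamma * best_next - old)
    return q[(s, a)]


ACTIONS = ["north", "south", "east", "west"]
q_table = {}

# Positions (3, 4) and (5, 4) fall into different macro-states.
s = macro_state(3, 4)
s_next = macro_state(5, 4)
value = q_update(q_table, s, "east", 1.0, s_next, ACTIONS)
```

Because many fine positions share one table entry, the table is smaller and fills faster, which is the speed-up the abstract mentions. The same sharing is why convergence can be lost: positions that need different actions may be forced to share one value.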

General detection model in cooperative multirobot localization

Author: BIANCHI Reinaldo Augusto da Costa
COSTA Anna Helena Reali
ODAKURA Valguima Victoria Viana Aguiar
Publication venue: Sociedade Brasileira de Computação
Publication date: 01/01/2009

The cooperative multirobot localization problem consists of localizing each robot in a group within the same environment, when robots share information in order to improve localization accuracy. It can be achieved when a robot detects and identifies another one and measures their relative distance. At that moment, both robots can use the detection information to update their own pose beliefs. However, other useful information besides a single detection between a pair of robots can be used to update the robots' pose beliefs, such as: propagation of a single detection to non-participant robots, absence of detections, and detections involving more than a pair of robots. A general detection model is proposed in order to aggregate all detection information, addressing the problem of updating pose beliefs in all the situations described. Experimental results in a simulated environment with groups of robots show that the proposed model improves localization accuracy when compared to conventional single-detection multirobot localization. Sponsors: FAPESP, CNP

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Universidade de São Paulo

Recommended from our members

Agents Teaching Agents: A Survey on Inter-agent Transfer Learning

Author: Helena Reali Costa Anna
Leno Da Silva Felipe
Stone Peter
Warnell Garrett
Publication venue: Autonomous Agents and Multi-Agent Systems
Publication date: 01/01/2020

Autonomous Agents and Multi-Agent Systems published a survey on inter-agent transfer learning in January 2020. Office of the VP for Researc

Texas ScholarWorks

Markov decision processes for ad network optimization

Author: Costa Anna Helena Reali
Cozman Fabio Gagliardi
Silva Valdinei Freire da
Truzzi Flávio Sales
Publication venue: Paraná

In this paper we examine a central problem in a particular advertising scheme: we are concerned with matching marketing campaigns that produce advertisements (“ads”) to impressions, where “impression” is a general term for any space in the internet that can display an ad. In this paper we propose a new take on the problem by resorting to planning techniques based on Markov Decision Processes, and by resorting to plan generation techniques that have been developed in the AI literature. We present a detailed formulation of the Markov Decision Process approach and results of simulated experiments. Anna Helena Reali Costa and Fábio Gagliardi Cozman are partially supported by CNPq. Flávio Sales Truzzi is supported by CAPES. The work reported here has received substantial support through FAPESP grant 2008/03995-5 and FAPESP grant 2011/19280-

Reinforcement Learning Applied to Trading Systems: A Survey

Author: Costa Anna Helena Reali
Del-Moral-Hernandez Emilio
Felizardo Leonardo Kanashiro
Paiva Francisco Caio Lima
Publication date: 01/11/2022

Financial domain tasks, such as trading in market exchanges, are challenging and have long attracted researchers. The recent achievements and the consequent notoriety of Reinforcement Learning (RL) have also increased its adoption in trading tasks. RL uses a framework with well-established formal concepts, which raises its attractiveness in learning profitable trading strategies. However, using RL in the financial area without due attention can lead new researchers to depart from standards or to overlook relevant conceptual guidelines. In this work, we embrace the seminal RL technical fundamentals, concepts, and recommendations to perform a unified, theoretically grounded examination and comparison of previous research that could serve as a structuring guide for the field of study. A selection of twenty-nine articles was reviewed under our classification, which considers RL's most common formulations and design patterns from a large volume of available studies. This classification allowed for precise inspection of the most relevant aspects regarding data input, preprocessing, state and action composition, adopted RL techniques, evaluation setups, and overall results. Our analysis approach, organized around fundamental RL concepts, allowed for a clear identification of current system design best practices, gaps that require further investigation, and promising research opportunities. Finally, this review attempts to promote the development of this field of study by facilitating researchers' commitment to standards adherence and helping them to avoid straying away from the RL constructs' firm ground. Comment: 38 page

arXiv.org e-Print Archive

Realidade Virtual: Estereoscopia na Educação

Author: Amorim Antonio Carlos O.
Arnaut Rodrigo Dias
Costa Anna Helena Reali
Kofuji Sérgio Takeo
Publication venue: Technical Scientific Journal
Publication date: 07/10/2011

Virtual reality (VR) in education is a topic strongly present in research institutions in many countries. This article discusses the application of VR techniques, including the use of computer graphics and the production of three-dimensional videos with specific but low-cost equipment for educational institutions. Stereoscopy is the key to visualizing these applications. The project uses a 3D lens, a consumer camera, low-cost projectors, polarized light filters, and passive 3D glasses. The goal of producing the 3D video was to evaluate everything from the processes involved in scriptwriting, recording, and exhibition to the costs required for an educational institution to adopt virtual reality resources to improve learning.

Instituto Federal Santa Catarina: Portal de Periódicos do IFSC

Speeding-up reinforcement learning through abstraction and transfer learning

Author: Costa Anna Helena Reali
Cozman Fabio Gagliardi
Koga Marcelo Li
Silva Valdinei Freire da
Publication venue: Saint Paul, Minnesota

We are interested in the following general question: is it possible to abstract knowledge that is generated while learning the solution of a problem, so that this abstraction can accelerate the learning process? Moreover, is it possible to transfer and reuse the acquired abstract knowledge to accelerate the learning process for future similar tasks? We propose a framework for conducting simultaneously two levels of reinforcement learning, where an abstract policy is learned while learning a concrete policy for the problem, such that both policies are refined through exploration and interaction of the agent with the environment. We explore abstraction both to accelerate the learning process for an optimal concrete policy for the current problem, and to allow the application of the generated abstract policy in learning solutions for new problems. We report experiments in a robot navigation environment that show our framework to be effective in speeding up policy construction for practical problems and in generating abstractions that can be used to accelerate learning in new similar problems. This research was partially supported by FAPESP (2011/19280-8, 2012/02190-9, 2012/19627-0) and CNPq (311058/2011-6, 305395/2010-6

From Random to Informed Data Selection: A Diversity-Based Approach to Optimize Human Annotation and Few-Shot Learning

Author: Alcoforado Alexandre
Bueno Bárbara Dias
Costa Anna Helena Reali
Fama Israel Campos
Ferraz Thomas Palmeira
Lavado Arnold Moya
Okamura Lucas Hideki
Veloso Bruno
Publication date: 23/01/2024

A major challenge in Natural Language Processing is obtaining annotated data for supervised learning. An option is the use of crowdsourcing platforms for data annotation. However, crowdsourcing introduces issues related to the annotator's experience, consistency, and biases. An alternative is to use zero-shot methods, which in turn have limitations compared to their few-shot or fully supervised counterparts. Recent advancements driven by large language models show potential, but struggle to adapt to specialized domains with severely limited data. The most common approaches therefore involve a human annotating a randomly chosen set of datapoints to build initial datasets. But randomly sampling data to be annotated is often inefficient, as it ignores the characteristics of the data and the specific needs of the model. The situation worsens when working with imbalanced datasets, as random sampling tends to heavily bias towards the majority classes, leading to excessive annotated data. To address these issues, this paper contributes an automatic and informed data selection architecture to build a small dataset for few-shot learning. Our proposal minimizes the quantity and maximizes the diversity of data selected for human annotation, while improving model performance. Comment: Accepted at PROPOR 2024 - The 16th International Conference on Computational Processing of Portugues

arXiv.org e-Print Archive

DEBACER: a method for slicing moderated debates

Author: Alcoforado Alexandre
Bustos Enzo
Costa Anna Helena Reali
d’Almeida André Corrêa
Ferraz Thomas Palmeira
Gerber Rodrigo
Müller Naíde
Oliveira André Seidel
Veloso Bruno Miguel
Publication venue: 'Sociedade Brasileira de Computacao - SB'
Publication date: 29/11/2021

Subjects change frequently in moderated debates with several participants, such as in parliamentary sessions, electoral debates, and trials. Partitioning a debate into blocks with the same subject is essential for understanding. Often a moderator is responsible for defining when a new block begins, so that the task of automatically partitioning a moderated debate can focus solely on the moderator's behavior. In this paper, we (i) propose a new algorithm, DEBACER, which partitions moderated debates; (ii) carry out a comparative study between conventional and BERTimbau pipelines; and (iii) validate DEBACER applying it to the minutes of the Assembly of the Republic of Portugal. Our results show the effectiveness of DEBACER.
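The observation that partitioning can focus on the moderator's turns can be sketched as follows. This is not the DEBACER implementation: the function `split_debate`, the speaker label "Moderator", and the keyword rule standing in for a trained classifier are all hypothetical, chosen only to show the control flow.

```python
def split_debate(turns, is_moderator, opens_block):
    """Group (speaker, text) turns into blocks.

    A new block starts only when a moderator turn is judged to open a
    new subject; every other turn is appended to the current block.
    """
    blocks = []
    for speaker, text in turns:
        starts_new = is_moderator(speaker) and opens_block(text)
        if starts_new or not blocks:
            blocks.append([])
        blocks[-1].append((speaker, text))
    return blocks


# Toy transcript (invented for illustration).
turns = [
    ("Moderator", "Next item: the budget."),
    ("A", "We support it."),
    ("Moderator", "Thank you."),
    ("B", "We oppose it."),
    ("Moderator", "Next item: health."),
    ("C", "Agreed."),
]

blocks = split_debate(
    turns,
    is_moderator=lambda speaker: speaker == "Moderator",
    # Assumed keyword rule; a text classifier would take this place.
    opens_block=lambda text: text.lower().startswith("next item"),
)
```

Keeping `opens_block` as a parameter separates the segmentation loop from the decision model, so a conventional or a BERT-based classifier could be swapped in without touching the loop.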

arXiv.org e-Print Archive

Repositório Institucional da Universidade Católica Portuguesa